**Final Project Proposal: Optimized High Speed and Low Power Booth Multiplier**

Jiaqi Gu\* (EID: jg68999), Yiqian Zhang+ (EID: yz6834)

Department of Electrical and Computer Engineering, the University of Texas at Austin

\* Email: jqgu@utexas.edu

\+ Email: irenezhang@utexas.edu

# Introduction

Multiplication is a critical and unavoidable operation in most real-world computing systems. However, multipliers often take many cycles to produce a result and may therefore fail to deliver data to the next computation stage in time. In addition, a long combinational delay within a single cycle can lower the maximum clock rate of the whole design. Latency aside, area is also a concern in modern designs: a multiplier that is not compact tends to dissipate more power and to cost more. We are therefore motivated to propose an optimized Booth multiplier that targets lower delay and lower area complexity.

# Background

The Booth multiplier [1] is based on an algorithm designed for fast multiplication. It first recodes adjacent bits of one operand (the multiplier) in order to reduce the number of partial products. All partial products are then computed and sign-extended so that two's complement numbers are multiplied correctly. Finally, the partial products are summed according to their weights to generate the result. This leaves several issues for us to address: (1) how to trade off the performance benefit against the overhead of different radices of the Booth algorithm; (2) how to achieve efficient sign extension; (3) how to achieve higher performance in partial product summation; and (4) how to reduce the area complexity of the hardware design.
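To make the three steps concrete (recoding, partial product generation, weighted summation), the following is a minimal behavioral sketch in Python of radix-4 Booth multiplication. It is a software model for illustration only, not our proposed hardware design; the function names `booth_radix4_digits` and `booth_multiply` and the default width `n_bits=8` are assumptions made for this example. Sign extension is handled implicitly by Python's arbitrary-precision signed integers, so the sketch does not model the sign-extension hardware discussed in issue (2).

```python
def booth_radix4_digits(multiplier: int, n_bits: int) -> list[int]:
    """Recode a two's complement multiplier into radix-4 Booth digits.

    Each digit lies in {-2, -1, 0, 1, 2}; digit i carries weight 4**i.
    The multiplier is assumed to fit in n_bits as a signed value.
    """
    if n_bits % 2:
        n_bits += 1  # extend to an even width (masking below sign-extends)
    pattern = multiplier & ((1 << n_bits) - 1)  # two's complement bit pattern
    digits = []
    prev = 0  # implicit bit to the right of the LSB
    for i in range(0, n_bits, 2):
        b0 = (pattern >> i) & 1
        b1 = (pattern >> (i + 1)) & 1
        # Overlapping 3-bit window (b1, b0, prev) -> signed digit
        digits.append(-2 * b1 + b0 + prev)
        prev = b1
    return digits


def booth_multiply(multiplicand: int, multiplier: int, n_bits: int = 8) -> int:
    """Multiply two signed n_bits integers via radix-4 Booth recoding."""
    digits = booth_radix4_digits(multiplier, n_bits)
    # One partial product per digit: about n_bits / 2 instead of n_bits.
    partial_products = [d * multiplicand * 4 ** i for i, d in enumerate(digits)]
    # Weighted summation; in hardware this is where an adder tree is used.
    return sum(partial_products)
```

For an 8-bit multiplier the recoding yields four digits, so only four partial products have to be summed rather than eight.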

To address the above issues, a number of algorithms and methodologies have been proposed in the literature. In particular, S. Abraham *et al*. [2] propose a modified Booth multiplier that halves the number of partial products. S. Dubey *et al*. [3] present a Wallace tree multiplier whose partial products are computed by a Booth multiplier, an architecture that results in a significant reduction in delay. R. D. Kshirsagar *et al*. [4] demonstrate a Wallace-tree-based high-throughput multiplier that uses a four-stage pipeline.

# Method

Given that our optimization will be based mainly on a Booth multiplier combined with a Wallace tree, we will start with a theoretical analysis of this architecture in terms of delay and complexity. We will then examine different radices of the Booth algorithm to find the optimal recoding. We will also examine different combinations of adders in the Wallace tree to find an architecture that sums the partial products faster. Through these explorations, we expect to arrive at a better-optimized multiplier. Lastly, we will implement our proposed design and conduct comparison experiments against other multipliers to validate the effectiveness of our method.
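As a rough illustration of why the number of partial products matters for the summation stage, the sketch below models Wallace-style reduction at the word level: rows are grouped in threes, each group passes through a 3:2 compressor (carry-save adder), and the process repeats until two rows remain for a final carry-propagate addition. This is a simplified model under stated assumptions, not the bit-level column compression of a real Wallace tree; the names `carry_save_add` and `wallace_reduce` are hypothetical, and the operands are assumed to be non-negative integers.

```python
def carry_save_add(a: int, b: int, c: int) -> tuple[int, int]:
    """Bitwise 3:2 compressor: returns (s, carry) with a + b + c == s + carry."""
    s = a ^ b ^ c                               # per-bit sum without carries
    carry = ((a & b) | (b & c) | (a & c)) << 1  # per-bit majority, shifted left
    return s, carry


def wallace_reduce(operands: list[int]) -> tuple[int, int]:
    """Reduce rows with layers of 3:2 compressors.

    Returns (total, levels), where levels counts the compressor layers
    needed before only two rows remain for the final adder.
    """
    rows = list(operands)
    levels = 0
    while len(rows) > 2:
        full = len(rows) - len(rows) % 3  # rows that form complete groups of 3
        next_rows = []
        for i in range(0, full, 3):
            next_rows.extend(carry_save_add(*rows[i:i + 3]))
        next_rows.extend(rows[full:])     # leftover rows pass through unchanged
        rows = next_rows
        levels += 1
    return sum(rows), levels              # sum() stands in for the final adder
```

In this model, eight rows need four compressor levels (8, 6, 4, 3, 2), whereas four rows need only two (4, 3, 2), which is one way to see the benefit of halving the partial products through higher-radix recoding before the tree.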

# References

1. A. D. Booth, “A Signed Binary Multiplication Technique,” *The Quarterly Journal of Mechanics and Applied Mathematics*, vol. 4, no. 2, pp. 236-240, 1951.
2. S. Abraham et al., “Study of Various High Speed Multipliers,” in *Proc. ICCCI*, Coimbatore, India, 2015.
3. S. Dubey et al., “A High Speed Wallace Tree Multiplier Using Modified Booth Algorithm for Fast Arithmetic Circuits,” *IOSRJECE*, vol. 3, no. 1, pp. 7-11, Sep. 2012.
4. R. D. Kshirsagar, E. V. Aishwarya, A. S. Vishwanath, and P. Jayakrishnan, “Implementation of Pipelined Booth Encoded Wallace Tree Multiplier Architecture,” in *Proc. ICGCE*, Chennai, India, 2013, pp. 199-204.